Abstract: Soft-input soft-output (SISO) detection algorithms
form the basis for iterative decoding. The associated computa- 
tional complexity often poses significant challenges for practical 
receiver implementations, in particular in the context of multiple- 
input multiple-output wireless systems. In this paper, we present 
a low-complexity SISO sphere decoder which is based on the 
single tree search paradigm, proposed originally for soft-output 
detection in Studer et al., IEEE J-SAC, 2008. The algorithm
incorporates clipping of the extrinsic log-likelihood ratios in
the tree search, which not only results in significant complexity
savings, but also makes it possible to cover a large
performance/complexity trade-off region by adjusting a single parameter.

I. Introduction

Soft-input soft-output (SISO) detection in multiple-input 
multiple-output (MIMO) systems constitutes the basis for 
iterative decoding, which, in general, achieves significantly 
better performance than decoding based on hard-output or
soft-output-only detection algorithms. Unfortunately, this
performance gain comes at the cost of a significant, and in
practical implementations often prohibitive, increase in
computational complexity.

Implementing different algorithms, each optimized for a 
maximum allowed detection effort or for a particular system 
configuration, would entail considerable circuit complexity. A 
practical SISO detector for MIMO systems should therefore 
not only exhibit low computational complexity but also cover a 
wide range of easily adjustable performance/complexity trade- 
offs. 

The single tree search (STS) soft-output sphere decoder 
(SD) in combination with log-likelihood ratio (LLR) clip- 
ping [1] has been demonstrated to be suitable for VLSI 
implementation and is efficiently tunable between max-log 
optimal soft-output and low-complexity hard-output detection 
performance. The STS-SD concept is therefore a promising 
basis for efficient SISO detection in MIMO systems. 

Contributions: We describe a SISO STS-SD algorithm that 
is tunable between max-log optimal SISO and hard-output 
maximum a posteriori (MAP) detection performance. To this 
end, we extend the soft-output STS-SD algorithm described 
in [1] by a max-log optimal a priori information processing 
method that significantly reduces the tree-search complexity 
compared to, e.g., [2], [3], and avoids the computation of tran- 
scendental functions. The basic idea of the proposed approach 
is to incorporate clipping of the extrinsic LLRs into the tree 
search. This requires that the list administration concept and 
the tree pruning criterion proposed for soft-output STS-SD 
in [1] be suitably modified. Simulation results show that the 
SISO STS-SD with extrinsic LLR clipping attains close to
max-log optimal SISO performance at remarkably low
computational complexity and, in addition, offers a significantly
larger performance/complexity trade-off region than the soft- 
output STS-SD in [1]. 

Notation: Matrices are set in boldface capital letters, vectors
in boldface lowercase letters. The superscripts $^T$ and $^H$ stand
for transpose and conjugate transpose, respectively. We write
$A_{i,j}$ for the entry in the $i$th row and $j$th column of the matrix
$\mathbf{A}$ and $b_i$ for the $i$th entry of the vector
$\mathbf{b} = [\, b_1 \; b_2 \; \cdots \; b_N \,]^T$.
$\mathbf{I}_N$ denotes the $N \times N$ identity matrix. Slightly abusing
common terminology, we call an $N \times M$ matrix $\mathbf{A}$, where
$N \ge M$, satisfying $\mathbf{A}^H \mathbf{A} = \mathbf{I}_M$, unitary.
$|\mathcal{O}|$ denotes the cardinality of the set $\mathcal{O}$. The
probability of an event $Z$ is denoted by $P[Z]$. $\overline{x}$ is the
binary complement of $x \in \{+1, -1\}$, i.e., $\overline{x} = -x$.

II. Soft-Input Soft-Output Sphere Decoding 

Consider a MIMO system with $M_T$ transmit and $M_R \ge M_T$
receive antennas. The coded bit-stream to be transmitted is
mapped to (a sequence of) $M_T$-dimensional transmit symbol
vectors $\mathbf{s} \in \mathcal{O}^{M_T}$, where $\mathcal{O}$ stands for
the underlying complex scalar constellation and $|\mathcal{O}| = 2^Q$.
Each symbol vector $\mathbf{s}$ is associated with a label vector
$\mathbf{x}$ containing $M_T Q$ binary values chosen from the set
$\{+1, -1\}$ where the null element (0 in binary logic) of GF(2)
corresponds to $+1$. The corresponding bits are denoted by $x_{j,b}$,
where the indices $j$ and $b$ refer to the $b$th bit in the binary label
of the $j$th entry of the symbol vector
$\mathbf{s} = [\, s_1 \; s_2 \; \cdots \; s_{M_T} \,]^T$. The associated
complex baseband input-output relation is given by

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n} \tag{1}$$

where $\mathbf{H}$ stands for the $M_R \times M_T$ channel matrix,
$\mathbf{y}$ is the $M_R$-dimensional received signal vector, and
$\mathbf{n}$ is an i.i.d. circularly symmetric complex Gaussian
distributed $M_R$-dimensional noise vector with variance $N_0$ per
complex entry.

A. Max-Log LLR Computation as a Tree Search 

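SISO detection for MIMO systems requires computation of
the LLRs [4], [5]

$$L^D_{j,b} \triangleq \log\left( \frac{P[x_{j,b} = +1 \mid \mathbf{y}, \mathbf{H}]}{P[x_{j,b} = -1 \mid \mathbf{y}, \mathbf{H}]} \right) \tag{2}$$

for all bits $j = 1, 2, \ldots, M_T$, $b = 1, 2, \ldots, Q$, in the label
$\mathbf{x}$. Transforming (2) into a tree-search problem and using the
sphere decoding algorithm allows efficient computation of
the LLRs [6], [3], [1]. To this end, the channel matrix $\mathbf{H}$
is first QR-decomposed according to $\mathbf{H} = \mathbf{Q}\mathbf{R}$,
where the $M_R \times M_T$ matrix $\mathbf{Q}$ is unitary and the
$M_T \times M_T$ upper-triangular matrix $\mathbf{R}$ has real-valued
positive entries on its main diagonal. Left-multiplying (1) by
$\mathbf{Q}^H$ leads to the modified input-output relation
$\tilde{\mathbf{y}} = \mathbf{R}\mathbf{s} + \mathbf{Q}^H\mathbf{n}$,
where $\tilde{\mathbf{y}} = \mathbf{Q}^H\mathbf{y}$.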
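Noting that $\mathbf{Q}^H\mathbf{n}$ is also i.i.d. circularly symmetric
complex Gaussian and using the max-log approximation leads to the
intrinsic LLRs [4]

$$L^D_{j,b} \approx \min_{\mathbf{s} \in \mathcal{X}^{(-1)}_{j,b}} \left\{ \frac{1}{N_0}\|\tilde{\mathbf{y}} - \mathbf{R}\mathbf{s}\|^2 - \log P[\mathbf{s}] \right\} - \min_{\mathbf{s} \in \mathcal{X}^{(+1)}_{j,b}} \left\{ \frac{1}{N_0}\|\tilde{\mathbf{y}} - \mathbf{R}\mathbf{s}\|^2 - \log P[\mathbf{s}] \right\} \tag{3}$$

where $\mathcal{X}^{(-1)}_{j,b}$ and $\mathcal{X}^{(+1)}_{j,b}$ are the sets
of symbol vectors that have the bit corresponding to the indices $j$ and
$b$ equal to $-1$ and $+1$, respectively. In the following, we consider
an iterative MIMO decoder as depicted in Fig. 1. A soft-input
soft-output MIMO detector computes intrinsic LLRs according to (3)
based on the received signal vector $\mathbf{y}$ and on a priori
probabilities in the form of the a priori LLRs
$L^A_{j,b} \triangleq \log\left( P[x_{j,b} = +1] / P[x_{j,b} = -1] \right)$,
$\forall j, b$, and delivers the extrinsic LLRs

$$L^E_{j,b} \triangleq L^D_{j,b} - L^A_{j,b}, \quad \forall j, b \tag{4}$$

to a subsequent SISO channel decoder.

Fig. 1. Iterative MIMO decoder [4]. The SISO STS-SD (corresponding to
the dashed box) directly computes extrinsic log-likelihood ratios.

For each bit, one of the two minima in (3) corresponds to

$$\lambda^{\mathrm{MAP}} \triangleq \frac{1}{N_0}\|\tilde{\mathbf{y}} - \mathbf{R}\mathbf{s}^{\mathrm{MAP}}\|^2 - \log P[\mathbf{s}^{\mathrm{MAP}}] \tag{5}$$

which is associated with the MAP solution of the MIMO
detection problem

$$\mathbf{s}^{\mathrm{MAP}} = \arg\min_{\mathbf{s} \in \mathcal{O}^{M_T}} \left\{ \frac{1}{N_0}\|\tilde{\mathbf{y}} - \mathbf{R}\mathbf{s}\|^2 - \log P[\mathbf{s}] \right\}. \tag{6}$$

The other minimum in (3) can be computed as

$$\lambda^{\overline{\mathrm{MAP}}}_{j,b} \triangleq \min_{\mathbf{s} \in \mathcal{X}^{(\overline{x^{\mathrm{MAP}}_{j,b}})}_{j,b}} \left\{ \frac{1}{N_0}\|\tilde{\mathbf{y}} - \mathbf{R}\mathbf{s}\|^2 - \log P[\mathbf{s}] \right\} \tag{7}$$

where the (bit-wise) counter-hypothesis $\overline{x^{\mathrm{MAP}}_{j,b}}$
to the MAP hypothesis denotes the binary complement of the $b$th bit in
the label of the $j$th entry of $\mathbf{s}^{\mathrm{MAP}}$. With the
definitions (5) and (7), the intrinsic max-log LLRs in (3) can be
written in compact form as

$$L^D_{j,b} = \begin{cases} \lambda^{\mathrm{MAP}} - \lambda^{\overline{\mathrm{MAP}}}_{j,b}, & x^{\mathrm{MAP}}_{j,b} = -1 \\ \lambda^{\overline{\mathrm{MAP}}}_{j,b} - \lambda^{\mathrm{MAP}}, & x^{\mathrm{MAP}}_{j,b} = +1. \end{cases} \tag{8}$$

We can therefore conclude that efficient max-log optimal
soft-input soft-output MIMO detection reduces to efficiently
identifying $\mathbf{s}^{\mathrm{MAP}}$, $\lambda^{\mathrm{MAP}}$, and
$\lambda^{\overline{\mathrm{MAP}}}_{j,b}$ ($\forall j, b$).

We next define the partial symbol vectors (PSVs)
$\mathbf{s}^{(j)} = [\, s_j \; s_{j+1} \; \cdots \; s_{M_T} \,]^T$ and note
that they can be arranged in a tree that has its root just above level
$j = M_T$ and leaves, on level $j = 1$, which correspond to symbol
vectors $\mathbf{s}$. The binary-valued label vector associated with
$\mathbf{s}^{(j)}$ will be denoted by $\mathbf{x}^{(j)}$. The distances

$$d(\mathbf{s}) = \frac{1}{N_0}\|\tilde{\mathbf{y}} - \mathbf{R}\mathbf{s}\|^2 - \log P[\mathbf{s}]$$

in (5) and (7) can be computed recursively if the individual
symbols $s_j$ ($j = 1, 2, \ldots, M_T$) are statistically independent,
i.e., if $P[\mathbf{s}] = \prod_{j=1}^{M_T} P[s_j]$. In this case, we have

$$d(\mathbf{s}) = \sum_{j=1}^{M_T} \left( \frac{1}{N_0}\left| \tilde{y}_j - \sum_{i=j}^{M_T} R_{j,i} s_i \right|^2 - \log P[s_j] \right)$$

which can be evaluated recursively as $d(\mathbf{s}) = d_1$, with the
partial distances (PDs)

$$d_j = d_{j+1} + |e_j|, \quad j = M_T, M_T - 1, \ldots, 1,$$

the initialization $d_{M_T+1} = 0$, and the distance increments (DIs)

$$|e_j| = \frac{1}{N_0}\left| \tilde{y}_j - \sum_{i=j}^{M_T} R_{j,i} s_i \right|^2 - \log P[s_j]. \tag{9}$$

Note that the DIs are non-negative since $-\log P[s_j] \ge 0$.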



The dependence of the PDs $d_j$ on the symbol vector $\mathbf{s}$ is
only through the PSV $\mathbf{s}^{(j)}$. Thus, the MAP detection problem
and the computation of the intrinsic max-log LLRs have
been transformed into tree-search problems: PSVs and PDs
are associated with nodes, branches correspond to DIs. For
brevity, we shall often say "the node $\mathbf{s}^{(j)}$" to refer to the
node corresponding to the PSV $\mathbf{s}^{(j)}$. We shall furthermore
use $d(\mathbf{s}^{(j)})$ and $d(\mathbf{x}^{(j)})$ interchangeably to
denote $d_j$. Each path from the root node down to a leaf node
corresponds to a symbol vector $\mathbf{s} \in \mathcal{O}^{M_T}$. The
solution of (6) and (7) corresponds to the leaf associated with the
smallest metric in $\mathcal{O}^{M_T}$ and
$\mathcal{X}^{(\overline{x^{\mathrm{MAP}}_{j,b}})}_{j,b}$, respectively.
The SISO STS-SD uses elements of the Schnorr-Euchner SD with radius
reduction [7], [8], briefly summarized as follows: The search in the
weighted tree is constrained to nodes which lie within a radius $r$
around $\tilde{\mathbf{y}}$ and tree traversal is performed depth-first,
visiting the children of a given node in ascending order of their PDs.
A node $\mathbf{s}^{(j)}$ with PD $d_j$ can be pruned (along with the
entire subtree originating from this node) whenever the pruning
criterion $d_j \ge r$ is met. In order to avoid the problem of
choosing a suitable starting radius, we initialize the algorithm
with $r = \infty$ and perform the update $r \leftarrow d(\mathbf{s})$
whenever a valid leaf node $\mathbf{s}$ has been reached. The complexity
measure employed in the remainder of the paper corresponds to the
number of nodes visited by the decoder including the leaves,
but excluding the root.
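The depth-first search with radius reduction summarized above can be sketched as follows. This is a minimal illustration, not the authors' implementation, and it rests on assumptions that are not in the paper: levels are 0-based, the names `sphere_decode` and `brute_force` are hypothetical, the prior terms are passed as a table `neg_log_prior[j][k]`, children are ordered by explicitly sorting their PDs, and a node counts as visited when it is entered without being pruned. The sketch returns only the hard-output MAP solution.

```python
import itertools
import math


def sphere_decode(R, y_tilde, constellation, neg_log_prior, N0):
    """Depth-first tree search with radius reduction (hard-output MAP).

    R is an upper-triangular M x M list of lists, y_tilde has length M, and
    neg_log_prior[j][k] = -log P[s_j = constellation[k]] (non-negative).
    Returns (MAP symbol vector, its metric d(s), number of visited nodes).
    """
    M = len(R)
    state = {"s_map": None, "radius": math.inf, "visited": 0}
    s = [0] * M  # current partial symbol vector; entries j..M-1 are valid

    def distance_increment(j, k):
        # Distance increment on level j, given the symbols above level j.
        s[j] = constellation[k]
        residual = y_tilde[j] - sum(R[j][i] * s[i] for i in range(j, M))
        return abs(residual) ** 2 / N0 + neg_log_prior[j][k]

    def search(j, d_parent):
        # Visit children in ascending order of their partial distances.
        children = sorted(
            (d_parent + distance_increment(j, k), k)
            for k in range(len(constellation))
        )
        for d_j, k in children:
            if d_j >= state["radius"]:
                break  # this child and all remaining siblings are pruned
            state["visited"] += 1
            s[j] = constellation[k]
            if j == 0:
                # Leaf reached: store the candidate and shrink the radius.
                state["s_map"], state["radius"] = list(s), d_j
            else:
                search(j - 1, d_j)

    search(M - 1, 0.0)
    return state["s_map"], state["radius"], state["visited"]


def brute_force(R, y_tilde, constellation, neg_log_prior, N0):
    """Exhaustive reference: evaluate d(s) for every symbol vector."""
    M = len(R)
    best = (None, math.inf)
    for idx in itertools.product(range(len(constellation)), repeat=M):
        s = [constellation[k] for k in idx]
        d = sum(
            abs(y_tilde[j] - sum(R[j][i] * s[i] for i in range(j, M))) ** 2 / N0
            + neg_log_prior[j][idx[j]]
            for j in range(M)
        )
        if d < best[1]:
            best = (s, d)
    return best
```

Because every distance increment is non-negative, a partial distance that already reaches the radius cannot lead to a better leaf, which is what justifies the `break`.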

B. Tree Search for Statistically Independent Bits 

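Consider the case where all $Q$ bits corresponding to a
symbol $s_j$ are statistically independent and the MIMO detector
obtains a priori LLRs $L^A_{j,b}$ ($\forall j, b$) from an external
device, e.g., a SISO channel decoder as depicted in Fig. 1. We then
have [9]

$$P[s_j] = \prod_{b=1}^{Q} \frac{\exp\left( \frac{1}{2}(1 + x_{j,b}) L^A_{j,b} \right)}{1 + \exp\left( L^A_{j,b} \right)}. \tag{10}$$

The contribution of the a priori LLRs to the DIs in (9) can be
obtained from (10) as

$$-\log P[s_j] = \tilde{K}_j - \sum_{b=1}^{Q} \frac{1}{2} x_{j,b} L^A_{j,b} \tag{11}$$

where the constants

$$\tilde{K}_j = \sum_{b=1}^{Q} \left( \frac{1}{2}\left| L^A_{j,b} \right| + \log\left( 1 + \exp\left( -\left| L^A_{j,b} \right| \right) \right) \right) \tag{12}$$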



are independent of the binary-valued variables $x_{j,b}$ and
$\tilde{K}_j > 0$ for $j = 1, 2, \ldots, M_T$. Because of
$-\log P[s_j] \ge 0$, we can trivially infer from (11) that
$\tilde{K}_j - \sum_{b=1}^{Q} \frac{1}{2} x_{j,b} L^A_{j,b} \ge 0$.
From (8) it follows that constant terms (i.e., terms that are
independent of the variables $x_{j,b}$ and hence of $\mathbf{s}$) in (5)
and (7) cancel out in the computation of the intrinsic LLRs and can
therefore be neglected. A straightforward method to avoid the
hardware-inefficient task of computing transcendental functions
in (12) is to set $\tilde{K}_j = 0$ in the computation of (11). This
can, however, lead to branch metrics that are not necessarily
non-negative, which would inhibit pruning of the search tree.
On the other hand, modifying the DIs in (9) by setting

$$|e_j| \triangleq \frac{1}{N_0}\left| \tilde{y}_j - \sum_{i=j}^{M_T} R_{j,i} s_i \right|^2 + K_j - \sum_{b=1}^{Q} \frac{1}{2} x_{j,b} L^A_{j,b} \tag{13}$$

with $K_j = \sum_{b=1}^{Q} \frac{1}{2}\left| L^A_{j,b} \right|$ also avoids
computing transcendental functions while guaranteeing that, since
$K_j - \sum_{b=1}^{Q} \frac{1}{2} x_{j,b} L^A_{j,b} \ge 0$, the so obtained
branch metrics are non-negative. Furthermore, as
$\tilde{K}_j \ge K_j$, using the modified DIs leads to tighter, but,
thanks to (8), still max-log optimal tree pruning, thereby (often
significantly) reducing the complexity compared to that obtained
through (9).

Note that in [10, Eq. 9], the prior term (11) was approximated as

$$-\log P[s_j] \approx \sum_{b=1}^{Q} \frac{1}{2}\left( \left| L^A_{j,b} \right| - x_{j,b} L^A_{j,b} \right)$$

for $|L^A_{j,b}| > 2$ ($b = 1, 2, \ldots, Q$) which corresponds exactly to
what was done here to arrive at (13). It is important, though,
to realize that using the modified DIs (13) does not lead to
an approximation of (8), as the neglected $\log(\cdot)$ term in (12)
does not affect (8).
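The relation between the exact prior term and its transcendental-free replacement can be checked numerically. The sketch below is illustrative only; the function names and the example a priori LLRs are assumptions. It evaluates $-\log P[s_j]$ directly from (10), the constant from (12), and the modified prior term that drops the $\log(\cdot)$ part, and it confirms that the two prior terms differ by a constant that does not depend on the label bits.

```python
import itertools
import math


def exact_neg_log_prior(x_bits, L_a):
    """-log P[s_j] from (10) for label bits x_bits in {+1, -1}."""
    return sum(
        math.log1p(math.exp(L)) - 0.5 * (1 + x) * L for x, L in zip(x_bits, L_a)
    )


def k_tilde(L_a):
    """Constant in (12); requires the transcendental log(1 + exp(-|L|))."""
    return sum(0.5 * abs(L) + math.log1p(math.exp(-abs(L))) for L in L_a)


def modified_prior_term(x_bits, L_a):
    """Prior part of the modified DIs: sum over bits of (|L| - x L) / 2."""
    return sum(0.5 * (abs(L) - x * L) for x, L in zip(x_bits, L_a))


def prior_gaps(L_a):
    """Exact minus modified prior term for every label of one symbol."""
    gaps = []
    for x_bits in itertools.product((+1, -1), repeat=len(L_a)):
        modified = modified_prior_term(x_bits, L_a)
        assert modified >= 0.0  # modified branch metrics stay non-negative
        gaps.append(exact_neg_log_prior(x_bits, L_a) - modified)
    return gaps


if __name__ == "__main__":
    # Hypothetical a priori LLRs for a symbol with Q = 3 bits.
    print(prior_gaps([1.5, -0.3, 4.0]))
```

All entries returned by `prior_gaps` coincide up to rounding, which is why dropping the constant leaves differences of metrics, and hence the max-log LLRs, unchanged.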

III. Extrinsic LLR Computation in a Single 
Tree Search 

Computing the intrinsic max-log LLRs in (8) requires determining
$\lambda^{\mathrm{MAP}}$ and the metrics
$\lambda^{\overline{\mathrm{MAP}}}_{j,b}$ associated with the
counter-hypotheses. For given $j$ and $b$ the metric
$\lambda^{\overline{\mathrm{MAP}}}_{j,b}$ is obtained by traversing only
those parts of the search tree that have leaves in
$\mathcal{X}^{(\overline{x^{\mathrm{MAP}}_{j,b}})}_{j,b}$. The quantities
$\lambda^{\mathrm{MAP}}$ and $\lambda^{\overline{\mathrm{MAP}}}_{j,b}$
can be computed using the SD based on the repeated tree
search (RTS) approach described in [6]. The RTS strategy
results, however, in redundant computations as (often significant)
parts of the search tree are revisited during the RTS
steps required to determine $\lambda^{\overline{\mathrm{MAP}}}_{j,b}$
($\forall j, b$). Following the STS paradigm described for soft-output
sphere decoding in [1], [3], we note that efficient computation of
$L^D_{j,b}$ ($\forall j, b$) requires that every node in the tree is
visited at most once. This can be achieved by searching for the MAP
solution and computing the metrics
$\lambda^{\overline{\mathrm{MAP}}}_{j,b}$ ($\forall j, b$) concurrently
and ensuring that the subtree emanating from a given node in the tree is
pruned if it cannot lead to an update of either $\lambda^{\mathrm{MAP}}$
or at least one of the $\lambda^{\overline{\mathrm{MAP}}}_{j,b}$. Besides
extending the ideas in [1] to take into account a priori information,
the main idea underlying the SISO STS-SD presented in this paper is to
directly compute the extrinsic LLRs $L^E_{j,b}$ through a tree search,
rather than computing $L^D_{j,b}$ first and then evaluating (4).

Due to the large dynamic range of LLRs, fixed-point hard- 
ware implementations need to constrain the magnitude of the 
LLR value. Evidently, clipping of the LLR magnitude leads 
to a degradation in terms of decoder performance. It has been 
noted in [1], [11] that incorporating LLR clipping into the 
tree search is very effective in terms of reducing complexity 
of max-log based soft-output sphere decoding. In addition, as
demonstrated in [1], LLR clipping, when built into the tree
search, also makes it possible to tune the detection algorithm in
terms of performance versus complexity by adjusting the LLR clipping
level. In the SISO case, we are ultimately interested in the
extrinsic LLRs $L^E_{j,b}$, and clipping should therefore ensure that
$|L^E_{j,b}| \le L_{\max}$. It is hence sensible to ask whether clipping of
the extrinsic LLRs can be built directly into the tree search.
The answer is in the affirmative and the corresponding solution 
is described below. 

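To prepare the ground for the formulation of the SISO STS-SD,
we write the extrinsic LLRs as

$$L^E_{j,b} = \begin{cases} \lambda^{\mathrm{MAP}} - \Lambda^{\overline{\mathrm{MAP}}}_{j,b}, & x^{\mathrm{MAP}}_{j,b} = -1 \\ \Lambda^{\overline{\mathrm{MAP}}}_{j,b} - \lambda^{\mathrm{MAP}}, & x^{\mathrm{MAP}}_{j,b} = +1 \end{cases} \tag{14}$$

where the quantities

$$\Lambda^{\overline{\mathrm{MAP}}}_{j,b} = \begin{cases} \lambda^{\overline{\mathrm{MAP}}}_{j,b} + L^A_{j,b}, & x^{\mathrm{MAP}}_{j,b} = -1 \\ \lambda^{\overline{\mathrm{MAP}}}_{j,b} - L^A_{j,b}, & x^{\mathrm{MAP}}_{j,b} = +1 \end{cases} \tag{15}$$

will be referred to as the extrinsic metrics. For the following
developments it will be convenient to define a function $f(\cdot)$
that transforms an intrinsic metric $\lambda$ with associated a priori
LLR $L^A$ and binary label $x$ to an extrinsic metric $\Lambda$ according
to

$$\Lambda = f(\lambda, L^A, x) = \begin{cases} \lambda - L^A, & x = +1 \\ \lambda + L^A, & x = -1. \end{cases} \tag{16}$$

With this notation, we can rewrite (15) more compactly as
$\Lambda^{\overline{\mathrm{MAP}}}_{j,b} = f\left( \lambda^{\overline{\mathrm{MAP}}}_{j,b}, L^A_{j,b}, x^{\mathrm{MAP}}_{j,b} \right)$.
The inverse function of (16) transforms an extrinsic metric $\Lambda$ to
an intrinsic metric $\lambda$ and is defined as

$$\lambda = f^{-1}(\Lambda, L^A, x) = \begin{cases} \Lambda + L^A, & x = +1 \\ \Lambda - L^A, & x = -1. \end{cases} \tag{17}$$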



We emphasize that the tree search algorithm described below
produces the extrinsic LLRs $L^E_{j,b}$ ($\forall j, b$) in (14) rather
than the intrinsic ones in (8). Consequently, careful modification of
the list administration steps, the pruning criterion, and the LLR
clipping rules of the soft-output algorithm described in [1] is
needed.
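The mappings (16) and (17) between intrinsic and extrinsic metrics are simple sign-dependent shifts. The following sketch, with hypothetical function names and made-up example values, implements both and checks that the extrinsic LLR formed from the extrinsic metric equals the intrinsic max-log LLR minus the a priori LLR.

```python
def to_extrinsic(lam, L_a, x):
    """f in (16): intrinsic metric -> extrinsic metric for label bit x."""
    return lam - L_a if x == +1 else lam + L_a


def to_intrinsic(Lam, L_a, x):
    """Inverse mapping (17): extrinsic metric -> intrinsic metric."""
    return Lam + L_a if x == +1 else Lam - L_a


def intrinsic_llr(lam_map, lam_bar, x_map):
    """Max-log intrinsic LLR from the MAP and counter-hypothesis metrics."""
    return lam_bar - lam_map if x_map == +1 else lam_map - lam_bar


def extrinsic_llr(lam_map, Lam_bar, x_map):
    """Extrinsic LLR from the MAP metric and the extrinsic metric."""
    return Lam_bar - lam_map if x_map == +1 else lam_map - Lam_bar


if __name__ == "__main__":
    # Hypothetical example values: MAP metric, counter metric, a priori LLR.
    lam_map, lam_bar, L_a = 1.0, 3.5, 0.75
    for x_map in (+1, -1):
        Lam_bar = to_extrinsic(lam_bar, L_a, x_map)
        print(x_map, extrinsic_llr(lam_map, Lam_bar, x_map),
              intrinsic_llr(lam_map, lam_bar, x_map) - L_a)
```

Working with extrinsic metrics in the list means the subtraction of the a priori LLR is absorbed once per metric update instead of being a separate post-processing step.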

A. List Administration 

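The main idea of the STS paradigm is to search the subtree
originating from a given node only if the result can lead to
an update of either $\lambda^{\mathrm{MAP}}$ or of at least one of the
$\Lambda^{\overline{\mathrm{MAP}}}_{j,b}$. To this end, the decoder needs
to maintain a list containing the label of the current MAP hypothesis
$\mathbf{x}^{\mathrm{MAP}}$, the corresponding metric
$\lambda^{\mathrm{MAP}}$, and all $Q M_T$ extrinsic metrics
$\Lambda^{\overline{\mathrm{MAP}}}_{j,b}$. The algorithm is initialized
with $\lambda^{\mathrm{MAP}} = \Lambda^{\overline{\mathrm{MAP}}}_{j,b} = \infty$
($\forall j, b$). Whenever a leaf with corresponding label $\mathbf{x}$
has been reached, the decoder distinguishes between two cases:

i) MAP Hypothesis Update: If $d(\mathbf{x}) < \lambda^{\mathrm{MAP}}$, a
new MAP hypothesis has been found. First, all extrinsic metrics
$\Lambda^{\overline{\mathrm{MAP}}}_{j,b}$ for which
$x_{j,b} = \overline{x^{\mathrm{MAP}}_{j,b}}$ are updated according to

$$\Lambda^{\overline{\mathrm{MAP}}}_{j,b} \leftarrow f\left( \lambda^{\mathrm{MAP}}, L^A_{j,b}, \overline{x^{\mathrm{MAP}}_{j,b}} \right)$$

followed by the updates $\lambda^{\mathrm{MAP}} \leftarrow d(\mathbf{x})$
and $\mathbf{x}^{\mathrm{MAP}} \leftarrow \mathbf{x}$.
In other words, for each bit in the MAP hypothesis that is
changed in the update process, the metric associated with the
former MAP hypothesis becomes the extrinsic metric of the
new counter-hypothesis.

ii) Extrinsic Metric Update: If $d(\mathbf{x}) \ge \lambda^{\mathrm{MAP}}$,
only extrinsic metrics corresponding to counter-hypotheses might
be updated. For each $j = 1, 2, \ldots, M_T$, $b = 1, 2, \ldots, Q$
with $x_{j,b} = \overline{x^{\mathrm{MAP}}_{j,b}}$ and
$f\left( d(\mathbf{x}), L^A_{j,b}, x^{\mathrm{MAP}}_{j,b} \right) < \Lambda^{\overline{\mathrm{MAP}}}_{j,b}$,
the SISO STS-SD performs the update

$$\Lambda^{\overline{\mathrm{MAP}}}_{j,b} \leftarrow f\left( d(\mathbf{x}), L^A_{j,b}, x^{\mathrm{MAP}}_{j,b} \right). \tag{18}$$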



B. Extrinsic LLR Clipping 

In order to ensure that the extrinsic LLRs delivered by the
algorithm indeed satisfy $|L^E_{j,b}| \le L_{\max}$ ($\forall j, b$), the
following update rule

$$\Lambda^{\overline{\mathrm{MAP}}}_{j,b} \leftarrow \min\left\{ \Lambda^{\overline{\mathrm{MAP}}}_{j,b}, \; \lambda^{\mathrm{MAP}} + L_{\max} \right\}, \quad \forall j, b \tag{19}$$

has to be applied after carrying out the steps in Case i) of
the list administration procedure described in Section III-A.
We emphasize that for $L_{\max} = \infty$ the decoder attains max-log
optimal SISO performance, whereas for $L_{\max} = 0$, the
hard-output MAP solution (6) is found.
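The two list administration cases and the clipping step can be sketched together as a small bookkeeping class. This is an illustrative sketch under assumptions that are not in the paper: the class name `MetricList` and its methods are hypothetical, all $Q M_T$ bits are stored in one flat list, and leaves are fed in arbitrary order without any tree pruning. Clipping is implemented as capping each extrinsic metric at the MAP metric plus `L_max` after a MAP hypothesis update.

```python
import math


def f(lam, L_a, x):
    """Mapping (16) from an intrinsic to an extrinsic metric."""
    return lam - L_a if x == +1 else lam + L_a


class MetricList:
    """Holds the MAP label, the MAP metric and all extrinsic metrics."""

    def __init__(self, L_a, L_max=math.inf):
        self.L_a = list(L_a)            # a priori LLR per bit (flat index)
        self.L_max = L_max              # extrinsic LLR clipping level
        self.lam_map = math.inf         # metric of the MAP hypothesis
        self.x_map = None               # label of the MAP hypothesis
        self.Lam = [math.inf] * len(self.L_a)  # extrinsic metrics

    def leaf_reached(self, x, d):
        """Update the list for a leaf with label x and metric d."""
        if d < self.lam_map:
            # Case i): the former MAP metric becomes the counter-hypothesis
            # metric of every bit that changes; f uses the new bit value.
            if self.x_map is not None:
                for n, (new, old) in enumerate(zip(x, self.x_map)):
                    if new != old:
                        self.Lam[n] = f(self.lam_map, self.L_a[n], new)
            self.lam_map, self.x_map = d, list(x)
            # Clipping: cap every extrinsic metric at lam_map + L_max.
            cap = self.lam_map + self.L_max
            self.Lam = [min(L, cap) for L in self.Lam]
        else:
            # Case ii): only bits that differ from the MAP label can improve.
            for n, (new, cur) in enumerate(zip(x, self.x_map)):
                if new != cur:
                    candidate = f(d, self.L_a[n], cur)
                    if candidate < self.Lam[n]:
                        self.Lam[n] = candidate

    def extrinsic_llrs(self):
        """Extrinsic LLRs formed from lam_map and the extrinsic metrics."""
        return [
            (L - self.lam_map) if xm == +1 else (self.lam_map - L)
            for L, xm in zip(self.Lam, self.x_map)
        ]
```

Feeding every leaf of a small tree through `leaf_reached` with `L_max` left at infinity should reproduce the exhaustive max-log extrinsic LLRs, which makes the class easy to validate before combining it with a pruned tree search.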

C. The Pruning Criterion 

Consider the node $\mathbf{s}^{(j)}$ on level $j$ corresponding to the
label bits $x_{i,b}$ ($i = j, j+1, \ldots, M_T$, $b = 1, 2, \ldots, Q$).
Assume that the subtree originating from this node and corresponding to
the label bits $x_{i,b}$ ($i = 1, 2, \ldots, j-1$, $b = 1, 2, \ldots, Q$)
has not been expanded yet. The criterion for pruning the node
$\mathbf{s}^{(j)}$ along with its subtree is compiled from two sets
defined as follows:

1) The bits in the partial label $\mathbf{x}^{(j)}$ corresponding to the
node $\mathbf{s}^{(j)}$ are compared with the corresponding bits in the
label of the current MAP hypothesis. All extrinsic metrics
$\Lambda^{\overline{\mathrm{MAP}}}_{i,b}$ with
$x_{i,b} = \overline{x^{\mathrm{MAP}}_{i,b}}$ found in this comparison
may be affected when searching the subtree originating
from $\mathbf{s}^{(j)}$. As the metric $d(\mathbf{x}^{(j)})$ is an
intrinsic metric, the extrinsic metrics
$\Lambda^{\overline{\mathrm{MAP}}}_{i,b}$ need to be mapped to intrinsic
metrics according to (17). The resulting set of intrinsic
metrics, which may be affected by an update, is given by

$$\mathcal{A}_1(\mathbf{x}^{(j)}) = \left\{ f^{-1}\left( \Lambda^{\overline{\mathrm{MAP}}}_{i,b}, L^A_{i,b}, x^{\mathrm{MAP}}_{i,b} \right) \;\middle|\; (i \ge j, \forall b) \wedge \left( x_{i,b} = \overline{x^{\mathrm{MAP}}_{i,b}} \right) \right\}.$$

2) The extrinsic metrics $\Lambda^{\overline{\mathrm{MAP}}}_{i,b}$ for
$i = 1, 2, \ldots, j-1$, $b = 1, 2, \ldots, Q$ corresponding to the
counter-hypotheses in the subtree of $\mathbf{s}^{(j)}$ may be affected
as well. Correspondingly, we define

$$\mathcal{A}_2(\mathbf{x}^{(j)}) = \left\{ f^{-1}\left( \Lambda^{\overline{\mathrm{MAP}}}_{i,b}, L^A_{i,b}, x^{\mathrm{MAP}}_{i,b} \right) \;\middle|\; i < j, \forall b \right\}.$$

In summary, the intrinsic metrics which may be affected
during the search in the subtree emanating from node $\mathbf{s}^{(j)}$
are given by
$\mathcal{A}(\mathbf{x}^{(j)}) = \{a_l\} = \mathcal{A}_1(\mathbf{x}^{(j)}) \cup \mathcal{A}_2(\mathbf{x}^{(j)})$.
The node $\mathbf{s}^{(j)}$ along with its subtree is pruned if the
corresponding PD $d(\mathbf{x}^{(j)})$ satisfies the pruning criterion

$$d(\mathbf{x}^{(j)}) > \max_{a_l \in \mathcal{A}(\mathbf{x}^{(j)})} a_l.$$

This pruning criterion ensures that a given node and the entire
subtree originating from that node are explored only if this
could lead to an update of either $\lambda^{\mathrm{MAP}}$ or of at least
one of the extrinsic metrics $\Lambda^{\overline{\mathrm{MAP}}}_{i,b}$.
Note that $\lambda^{\mathrm{MAP}}$ does not appear in the set
$\mathcal{A}(\mathbf{x}^{(j)})$ as the update criteria given in
Section III-A ensure that $\lambda^{\mathrm{MAP}}$ is always smaller than
or equal to all intrinsic metrics associated with the counter-hypotheses.
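The pruning test can be sketched as a function that collects the intrinsic metrics a subtree could still improve and compares the node's partial distance against their maximum. The sketch is illustrative and rests on assumptions that are not in the paper: levels are 0-based (level 0 holds the leaves), metrics are stored as nested lists indexed `[i][b]`, the name `should_prune` is hypothetical, and for already decided symbols only bits that differ from the current MAP label are included, since those are the only bits whose counter-hypothesis metric a leaf below the node can update.

```python
import math


def f_inv(Lam, L_a, x):
    """Mapping (17) from an extrinsic back to an intrinsic metric."""
    return Lam + L_a if x == +1 else Lam - L_a


def should_prune(d_partial, j, x_partial, x_map, Lam, L_a):
    """Return True if the node on level j (0-based) can be pruned.

    x_partial[i][b] holds the label bits of symbols i >= j (entries for
    i < j are ignored); x_map, Lam and L_a are indexed [i][b] as well.
    """
    if x_map is None:
        return False  # no leaf reached yet: every metric is still infinite
    affected = []
    for i in range(len(x_map)):
        for b in range(len(x_map[i])):
            intrinsic = f_inv(Lam[i][b], L_a[i][b], x_map[i][b])
            if i < j:
                # Undecided symbols: any counter-hypothesis may be updated.
                affected.append(intrinsic)
            elif x_partial[i][b] != x_map[i][b]:
                # Decided bits that differ from the current MAP label.
                affected.append(intrinsic)
    # The set is empty only for a leaf identical to the MAP label, which a
    # tree search never revisits; treat that case as prunable.
    return d_partial > max(affected, default=-math.inf)


if __name__ == "__main__":
    # Hypothetical list state for M_T = 2 symbols with Q = 1 bit each.
    x_map = [[+1], [-1]]
    Lam = [[2.0], [3.0]]
    L_a = [[0.5], [-0.25]]
    print(should_prune(2.6, 1, [None, [-1]], x_map, Lam, L_a))
```

In a complete decoder this check would replace the plain radius comparison of a hard-output sphere decoder, with the list state updated at every leaf before the search continues.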

IV. Simulation Results 

Fig. 2 shows performance/complexity trade-off curves$^1$ for
the SISO STS-SD described in Sections II and III. The
numbers next to the SISO STS-SD trade-off curves correspond
to normalized LLR clipping levels given by $L_{\max} N_0$. The
average (over channel, noise, and data realizations) complexity
corresponds to the cumulative tree-search complexity associated
with SISO detection over $I$ iterations, where one iteration
is defined as using the MIMO detector (and the subsequent
channel decoder) once. The curve associated with $I = 1$
hence corresponds to the soft-output STS-SD described in [1].
Increasing the number of iterations makes it possible to reduce the SNR
operating point (defined as the minimum SNR required to
achieve a frame error rate of 1%) at the cost of increased
complexity. We can see that performance improves significantly
with increasing number of iterations. Note, however, that for
a fixed SNR operating point, the minimum complexity is not
necessarily achieved by maximizing the number of iterations $I$
as the trade-off region is parametrized by the LLR clipping
level and the number of iterations $I$.

$^1$ All simulation results are for a convolutionally encoded (rate 1/2,
generator polynomials [$133_o$ $171_o$], and constraint length 7)
MIMO-OFDM system with $M_T = M_R = 4$, 16-QAM symbol constellation with
Gray labeling, 64 OFDM tones, a TGn type C channel model [12], and are
based on a max-log BCJR channel decoder. One frame consists of 1024
randomly interleaved (across space and frequency) bits corresponding to
one (spatial) OFDM symbol. The SNR is per receive antenna.

Fig. 2. Performance/complexity trade-off of the SISO STS-SD with sorted
QR decomposition (SQRD) as described in [13]. The numbers next to the
curves correspond to normalized LLR clipping levels and $I$ denotes the
number of iterations over the MIMO detector (and the channel decoder).

Fig. 3. Performance/complexity (trade-off) comparison of the list sphere
decoder (LSD) [4] and the SISO STS-SD (both using SQRD). The numbers
next to the curves correspond to the list size for the LSD and to
normalized LLR clipping levels for the SISO STS-SD.

Fig. 3 compares the performance/complexity trade-off
achieved by the list sphere decoder (LSD) [4] to that obtained
through the SISO STS-SD. For the LSD we take the
complexity to equal the number of nodes visited when building the
initial candidate list. The (often significant) computational
burden incurred by the list administration of the LSD is neglected
here. We can draw the following conclusions from Fig. 3:

i) The SISO STS-SD outperforms the LSD for all SNR 
values. 

ii) The LSD requires relatively large list sizes and hence a 
large amount of memory to approach max-log optimum 
SISO performance. The underlying reason is that the 
LSD obtains extrinsic LLRs from a list that has been 
computed around the maximum-likelihood solution, i.e., 
in the absence of a priori information. In contrast, the 
SISO STS-SD requires memory mainly for the extrinsic 
metrics. The extrinsic LLRs are obtained through a search 
that is concentrated around the MAP solution. Therefore, 
the SISO STS-SD tends to require (often significantly) 
less memory than the LSD. 

Besides the LSD, various other SISO detection algorithms
for MIMO systems have been developed, see, e.g., [5], [10],
[14], [15]. For [5], [14], issues indicating potentially prohibitive
computational complexity include the requirement for multiple
matrix inversions at symbol-vector rate. In contrast, the QR
decomposition required for sphere decoding has to be computed
only once per frame. The computational complexity of the
list-sequential (LISS) algorithm in [10], [15] seems difficult
to relate to the complexity measure employed in this paper.
However, due to the need for sorting of candidate vectors in
a list and the structural similarity of the LISS and the LSD
algorithms, we expect their computational complexity behavior
to be comparable as well.